A Self-Adaptive Fuzzy c-Means Algorithm for Determining the Optimal Number of Clusters
Authors
Abstract
Fuzzy c-means (FCM) requires the number of clusters to be known in advance. To address this shortcoming, this paper proposes a self-adaptive method for determining the optimal number of clusters. First, a density-based algorithm is put forward. Guided by the characteristics of the dataset, it automatically determines the maximum plausible number of clusters instead of relying on the empirical rule [Formula: see text], and it supplies the optimal initial cluster centroids, which mitigates FCM's tendency to converge to a local minimum when centroids are selected at random. Second, by introducing a penalty function, the paper proposes a new fuzzy clustering validity index based on fuzzy compactness and separation. The penalty ensures that, as the number of clusters approaches the number of objects in the dataset, the index value does not decrease monotonically toward zero, a behavior that would otherwise cost the index its robustness and its ability to discriminate the optimal number of clusters. Building on these two components, a self-adaptive FCM algorithm estimates the optimal number of clusters through an iterative trial-and-error process. Finally, experiments on UCI, KDD Cup 1999, and synthetic datasets show that the method not only determines the optimal number of clusters effectively but also reduces the number of FCM iterations while yielding a stable clustering result.
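The trial-and-error search described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract gives neither the density-based initialisation nor the penalised validity index, so this sketch substitutes a farthest-first initialisation and the classical Xie-Beni index (compactness divided by separation, lower is better). The function names `farthest_first`, `memberships`, `fcm`, `xie_beni`, and `choose_c`, and the parameter `c_max`, are all hypothetical.

```python
import numpy as np

def farthest_first(X, c):
    """Deterministic initial centroids (an assumption, NOT the paper's density-based step)."""
    # Start from the point farthest from the data mean, then repeatedly add
    # the point farthest from all centroids chosen so far.
    idx = [int(np.argmax(((X - X.mean(axis=0)) ** 2).sum(axis=1)))]
    d2 = ((X - X[idx[0]]) ** 2).sum(axis=1)
    for _ in range(c - 1):
        idx.append(int(np.argmax(d2)))
        d2 = np.minimum(d2, ((X - X[idx[-1]]) ** 2).sum(axis=1))
    return X[idx].copy()

def memberships(X, V, m):
    """Standard FCM membership update; returns U with shape (c, n)."""
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
    inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

def fcm(X, c, m=2.0, max_iter=300, tol=1e-6):
    """Plain fuzzy c-means: alternate membership and centroid updates."""
    V = farthest_first(X, c)
    for _ in range(max_iter):
        Um = memberships(X, V, m) ** m
        V_new = (Um @ X) / Um.sum(axis=1, keepdims=True)
        converged = np.linalg.norm(V_new - V) < tol
        V = V_new
        if converged:
            break
    return V, memberships(X, V, m)

def xie_beni(X, V, U, m=2.0):
    """Xie-Beni index (a stand-in for the paper's index): compactness / separation."""
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
    compactness = ((U ** m) * d2).sum()
    sep = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sep, np.inf)  # ignore each centroid's distance to itself
    min_sep = sep.min()
    if min_sep == 0.0:
        return float("inf")  # coincident centroids: worst possible score
    return float(compactness / (X.shape[0] * min_sep))

def choose_c(X, c_max, m=2.0):
    """Run FCM for c = 2..c_max and keep the c with the lowest index value."""
    scores = {}
    for c in range(2, c_max + 1):
        V, U = fcm(X, c, m=m)
        scores[c] = xie_beni(X, V, U, m)
    return min(scores, key=scores.get), scores
```

In this sketch `c_max` is supplied by the caller, whereas the paper derives the upper bound from the data. Calling `choose_c(X, c_max=6)` on an array `X` of shape (n_samples, n_features) returns the selected number of clusters together with the index value for each candidate.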
Similar resources
Oil Reservoirs Classification Using Fuzzy Clustering (RESEARCH NOTE)
Enhanced Oil Recovery (EOR) is a well-known method to increase oil production from oil reservoirs. Applying EOR to a new reservoir is a costly and time-consuming process. Incorporating available knowledge of oil reservoirs in the EOR process reduces these costs and saves operational time and work. This work presents a universal method to apply EOR to reservoirs based on the available data by...
OPTIMIZATION OF FUZZY CLUSTERING CRITERIA BY A HYBRID PSO AND FUZZY C-MEANS CLUSTERING ALGORITHM
This paper presents an efficient hybrid method, combining the fuzzy particle swarm optimization (FPSO) and fuzzy c-means (FCM) algorithms, to solve the fuzzy clustering problem, especially for large sizes. When the problem becomes large, the FCM algorithm may result in uneven distribution of data, making it difficult to find an optimal solution in a reasonable amount of time. The PSO algorithm does find a go...
ADAPTIVE NEURO FUZZY INFERENCE SYSTEM BASED ON FUZZY C–MEANS CLUSTERING ALGORITHM, A TECHNIQUE FOR ESTIMATION OF TBM PENETRATION RATE
Estimating the penetration rate of a tunnel boring machine (TBM) is one of the crucial and complex tasks frequently encountered in mechanized tunnel excavation. Estimating the machine penetration rate may reduce the risks related to the high capital costs typical of excavation operations. Thus establishing a relationship between rock properties and TBM pe...
Refining membership degrees obtained from fuzzy C-means by re-fuzzification
Fuzzy C-means (FCM) is the best-known and most widely used fuzzy clustering algorithm. However, one of the weaknesses of FCM is the way it assigns membership degrees to data, which is based on the distance to the cluster centers. Unfortunately, the membership degrees are determined without considering the shape and density of the clusters. In this paper, we propose an algorithm which takes th...
Hybrid ANFIS with ant colony optimization algorithm for prediction of shear wave velocity from a carbonate reservoir in Iran
Shear wave velocity (Vs) data are key information for petrophysical, geophysical and geomechanical studies. Although compressional wave velocity (Vp) measurements exist in almost all wells, shear wave velocity is not recorded for most older wells owing to a lack of suitable technology. Furthermore, measurement of shear wave velocity is to some extent costly. This study proposes a novel methodolo...
Journal title:
Volume 2016, Issue
Pages -
Publication date: 2016